MST-based Semi-supervised Clustering Using M-labeled Objects
Abstract
Most existing semi-supervised clustering algorithms depend on pairwise constraints and typically need a large amount of prior knowledge to improve their accuracy. In this paper, we use a different semi-supervised technique, label propagation, to help detect clusters. We propose two new semi-supervised algorithms, K-SSMST and M-SSMST, both of which aim to discover clusters of diverse density and arbitrary shape. Based on a variant of the minimum spanning tree (MST) algorithm, K-SSMST automatically finds the natural clusters in a dataset from K labeled data objects, where K is the number of clusters. M-SSMST can detect new clusters when the semi-supervised information is insufficient. Both algorithms have been tested on various artificial and UCI datasets. The results demonstrate that their accuracy is better than that of other supervised and semi-supervised approaches.
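The abstract does not spell out how K-SSMST works, so the following is only a minimal sketch of the general idea of combining a minimum spanning tree with label propagation from a few labeled objects. It is not the authors' procedure: it uses a seeded variant of Kruskal's algorithm (a minimum spanning forest grown from the labeled objects), and the function name `seeded_mst_clustering`, the Euclidean distance, and the toy data are all assumptions made for illustration.

```python
import math
from itertools import combinations


def seeded_mst_clustering(points, seeds):
    """Label every point by growing a minimum spanning forest from seeds.

    points: list of coordinate tuples.
    seeds:  dict mapping point index -> cluster label (at least one entry).
    Hypothetical helper for illustration; not the K-SSMST algorithm itself.
    """
    n = len(points)
    parent = list(range(n))                   # union-find forest
    label = [seeds.get(i) for i in range(n)]  # label stored at each root

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]     # path halving
            i = parent[i]
        return i

    # All pairwise edges, shortest first, as in Kruskal's algorithm.
    edges = sorted(
        (math.dist(points[a], points[b]), a, b)
        for a, b in combinations(range(n), 2)
    )
    for _, a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue                          # already in the same tree
        la, lb = label[ra], label[rb]
        if la is not None and lb is not None and la != lb:
            continue                          # never join two different labels
        parent[ra] = rb                       # merge the two trees
        if lb is None:
            label[rb] = la                    # propagate the known label
    return [label[find(i)] for i in range(n)]


# Toy data: two well-separated groups, one labeled object per cluster.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_seeds = {0: "A", 3: "B"}
print(seeded_mst_clustering(toy_points, toy_seeds))
```

Because edges are processed from shortest to longest, each unlabeled object joins the tree of the labeled object it can reach through the shortest chain of links. This lets a cluster follow an arbitrary shape, which is the property the abstract attributes to the MST-based approach.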
Similar resources
Model Selection for Semi-Supervised Clustering
Although there is a large and growing literature that tackles the semi-supervised clustering problem (i.e., using some labeled objects or cluster-guiding constraints like “must-link” or “cannot-link”), the evaluation of semi-supervised clustering approaches has rarely been discussed. The application of cross-validation techniques, for example, is far from straightforward in the semi-supervised ...
Semi-supervised Clustering on Heterogeneous Information Networks
Semi-supervised clustering on information networks combines both the labeled and unlabeled data sets with an aim to improve the clustering performance. However, the existing semi-supervised clustering methods are all designed for homogeneous networks and do not deal with heterogeneous ones. In this work, we propose a semi-supervised clustering approach to analyze heterogeneous information netwo...
Robust Method for E-Maximization and Hierarchical Clustering of Image Classification
We developed a new semi-supervised EM-like algorithm that is given the set of objects present in each training image, but does not know which regions correspond to which objects. We have tested the algorithm on a dataset of 860 hand-labeled color images using only color and texture features, and the results show that our EM variant is able to break the symmetry in the initial solution. We compared...
Clustering Analysis for Semi-supervised Learning Improves Classification Performance of Digital Pathology
Purpose: Completely labeled datasets of pathology slides are often difficult and time-consuming to obtain. Semi-supervised learning methods are able to learn reliable models from a small number of labeled instances and large quantities of unlabeled data. In this paper, we explored the potential of clustering analysis for a semi-supervised support vector machine (SVM) classifier. Method: A clusterin...
Incremental multi-class semi-supervised clustering regularized by Kalman filtering
This paper introduces an on-line semi-supervised learning algorithm formulated as a regularized kernel spectral clustering (KSC) approach. We consider the case where new data arrive sequentially but only a small fraction of it is labeled. The available labeled data act as prototypes and help to improve the performance of the algorithm to estimate the labels of the unlabeled data points. We adop...
Publication date: 2012